The standard error of measurement is a more appropriate measure of quality for postgraduate medical assessments than is reliability: an analysis of MRCP(UK) examinations

Authors

  • Jane Tighe
  • IC McManus
  • Neil G Dewhurst
  • Liliana Chis
  • John Mucklow
Abstract

BACKGROUND Cronbach's alpha is widely used as the preferred index of reliability for medical postgraduate examinations. A value of 0.8-0.9 is seen by providers and regulators alike as an adequate demonstration of acceptable reliability for any assessment. Of the other statistical parameters, the Standard Error of Measurement (SEM) is mainly seen as useful only in determining the accuracy of a pass mark. However, the alpha coefficient depends both on SEM and on the ability range (standard deviation, SD) of candidates taking an exam. This study investigated the extent to which the necessarily narrower ability range in candidates taking the second of the three-part MRCP(UK) diploma examinations biases assessment of reliability and SEM.

METHODS a) The interrelationships of standard deviation (SD), SEM and reliability were investigated in a Monte Carlo simulation of 10,000 candidates taking a postgraduate examination. b) Reliability and SEM were studied in the MRCP(UK) Part 1 and Part 2 Written Examinations from 2002 to 2008. c) Reliability and SEM were studied in eight Specialty Certificate Examinations introduced in 2008-9.

RESULTS The Monte Carlo simulation showed, as expected, that restricting the range of an assessment to only those who had already passed it dramatically reduced the reliability but did not affect the SEM of a simulated assessment. The analysis of the MRCP(UK) Part 1 and Part 2 written examinations showed that the MRCP(UK) Part 2 written examination had a lower reliability than the Part 1 examination but, despite that lower reliability, the Part 2 examination also had a smaller SEM (indicating a more accurate assessment). The Specialty Certificate Examinations had small Ns and, as a result, wide variability in their reliabilities, but SEMs were comparable with MRCP(UK) Part 2.

CONCLUSIONS An emphasis on assessing the quality of assessments primarily in terms of reliability can produce a paradoxical and distorted picture, particularly where a narrower range of candidate ability is an inevitable consequence of being able to take a second-part examination only after passing the first-part examination. Reliability also shows problems when numbers of candidates in examinations are low and sampling error affects the range of candidate ability. SEM is not subject to such problems; it is therefore a better measure of the quality of an assessment and is recommended for routine use.
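
The dependence of reliability on both SEM and candidate SD follows from the classical test theory identity SEM = SD × sqrt(1 − reliability). The Python sketch below illustrates the range-restriction effect described in the abstract; it is not the authors' simulation. It makes several assumptions that do not come from the source: a true-score SD of 1.0, an error SD of 0.5, a pass mark of 0.0 on a separate first-part score, and reliability estimated as the correlation between two parallel forms rather than as Cronbach's alpha. All function names are hypothetical.

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reliability_and_sem(form_a, form_b):
    """Parallel-forms reliability and SEM = SD * sqrt(1 - reliability)."""
    r = pearson(form_a, form_b)
    sd = statistics.pstdev(form_a)
    return r, sd * math.sqrt(1.0 - r)


def simulate(n=10_000, true_sd=1.0, error_sd=0.5, pass_mark=0.0, seed=1):
    """Compare all candidates with those who passed a first-part exam.

    Parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, true_sd) for _ in range(n)]

    def noisy_scores():
        # Observed score = true ability + independent measurement error.
        return [t + rng.gauss(0.0, error_sd) for t in ability]

    part1 = noisy_scores()   # first-part exam, used only for selection
    form_a = noisy_scores()  # second-part exam
    form_b = noisy_scores()  # parallel form of the second-part exam

    everyone = reliability_and_sem(form_a, form_b)

    # Restrict the range: keep only candidates who passed the first part.
    passed = [i for i, score in enumerate(part1) if score >= pass_mark]
    restricted = reliability_and_sem(
        [form_a[i] for i in passed],
        [form_b[i] for i in passed],
    )
    return everyone, restricted


if __name__ == "__main__":
    (r_all, sem_all), (r_pass, sem_pass) = simulate()
    print(f"all candidates:    reliability={r_all:.3f}  SEM={sem_all:.3f}")
    print(f"passed first part: reliability={r_pass:.3f}  SEM={sem_pass:.3f}")
```

Under these assumed parameters the population reliability for all candidates is 1 / (1 + 0.25) = 0.8 and the SEM is 0.5. Selecting on the first-part score narrows the SD of the remaining candidates, so their reliability estimate falls, while the SEM stays close to the error SD because the measurement error of the second-part forms is independent of the selection.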

Similar resources

Changes in standard of candidates taking the MRCP(UK) Part 1 examination, 1985 to 2002: Analysis of marker questions

BACKGROUND The maintenance of standards is a problem for postgraduate medical examinations, particularly if they use norm-referencing as the sole method of standard setting. In each of its diets, the MRCP(UK) Part 1 Examination includes a number of marker questions, which are unchanged from their use in a previous diet. This paper describes two complementary studies of marker questions for 52 d...

Obtaining the MRCP diploma – difficult Olympic hurdles or a straightforward triple jump?

The Royal Colleges of Physicians of the United Kingdom have a common membership examination in general medicine and successful candidates are eligible for the award of the MRCP (UK) Diploma. This important postgraduate qualification is achieved after passing three separate examinations typically known as MRCP Part 1, MRCP Part 2 and MRCP PACES. Attaining the MRCP (UK) Diploma or “full membersh...

The Reliability of Bubble Inclinometer and Tape Measure in Determining Lumbar Spine Range of Motion in Healthy Individuals and Patients

Purpose: Regarding the high prevalence of low back pain in various communities and the need to determine an appropriate treatment plan for these patients, examining their functional limitation and disability level is of utmost importance. In this regard, one of the important indicators is Lumbar range of motion. Measurement of the range of motion is a common and appropriate method for determini...

Test-retest reliability of Motricity Index strength assessments for lower extremity in post stroke hemiparesis

Background: The Motricity Index was used to measure strength in upper and lower extremities after stroke. The weighted score based on the ordinal 6-point scale of the Medical Research Council was used to measure maximal isometric muscle strength. There is a dearth of articles dealing with the reliability of this method. Therefore, the aim of this study was to determine the test retest reliability o...

Comparison of Graduate Medical Education in Iran with WFME International Guidelines: Quality Improvement in Postgraduate Medical Education

In 2001, following the development of International Standards in basic medical education, WFME appointed an international Task Force for development of International Guidelines for Postgraduate Specialist Training. Reports of this Task Force were published in September 2001. These Guidelines have been structured in 9 areas and 37 sub-areas. The areas of these guidelines are mission & outcomes, t...

Journal title:

Volume 10, Issue 

Pages  -

Publication date: 2010